Skip to content

Enable Client Side Access Logs for SD#2955

Merged
istio-testing merged 4 commits intoistio:masterfrom
gargnupur:nup_add_client_log
Aug 11, 2020
Merged

Enable Client Side Access Logs for SD#2955
istio-testing merged 4 commits intoistio:masterfrom
gargnupur:nup_add_client_log

Conversation

@gargnupur
Copy link
Contributor

What this PR does / why we need it:
Enable Client Side Access Logs for SD

Which issue this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close that issue when PR gets merged): fixes #

Special notes for your reviewer:
Will need sampling for client side logs too.. will add in a follow-up PR.

Release note:

@gargnupur gargnupur requested review from a team, bianpengyuan, kyessenov and mandarjog July 31, 2020 21:57
@googlebot googlebot added the cla: yes Set by the Google CLA bot to indicate the author of a PR has signed the Google CLA. label Jul 31, 2020
@istio-testing istio-testing added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Jul 31, 2020
@douglas-reid
Copy link
Contributor

Do we want a way to control whether or not we want both server and client access logs in config?

inline bool StackdriverRootContext::enableServerAccessLog() {
return !config_.disable_server_access_logging() && !isOutbound();
inline bool StackdriverRootContext::enableAccessLog() {
return !config_.disable_server_access_logging();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like we want more config around enabling client access logging and, relatedly, we need to assess whether or not disable_server_access_logging should be renamed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we just add disable_client_access_logging ? changing disable_server_access_logging might cause compatibility issues?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added disable_client_access_logging

(*label_map)["destination_name"] =
flatbuffers::GetString(local_node_info.name());
void Logger::fillDestinationLabels(
const ::Wasm::Common::FlatNode& node_info,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: destination_node_info

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

}

void Logger::fillSourceLabels(
const ::Wasm::Common::FlatNode& node_info,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: source_node_info

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

destination_version: v1
protocol: http
log_sampled: "true"
upstream_cluster: "outbound|9080|http|server.default.svc.cluster.local"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why does this entry not mention any trace related information?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are not sending any request with trace headers... will add a test for that in a separate pr

@gargnupur
Copy link
Contributor Author

Do we want a way to control whether or not we want both server and client access logs in config?

@douglas-reid : I think it we be good to have a separate configs for both? @bianpengyuan , @mandarjog : wdyt?

@mandarjog
Copy link
Contributor

Per discussion

  1. separate config options
  2. bool always_log: default=false
  3. bool skip_if_server_logs default=true

or just one bool?

@gargnupur
Copy link
Contributor Author

Per discussion

  1. separate config options
  2. bool always_log: default=false
  3. bool skip_if_server_logs default=true

or just one bool?

Need 2 in my mind..
so we will have following access logs configs:

  1. disable_server_access_logging -> default = true
  2. enable_client_access_logging -> default = false
  3. skip_client_log_if_server_logs -> default = true (=> logs when MX fails)
    /cc @mandarjog , @bianpengyuan , @douglas-reid , @kyessenov

@gargnupur gargnupur force-pushed the nup_add_client_log branch from a1b0092 to 66f2075 Compare August 5, 2020 21:26
@gargnupur
Copy link
Contributor Author

@bianpengyuan, @douglas-reid , @kyessenov, @mandarjog : This is ready for review..

bool disable_client_access_logging = 10;

// Optional. Controls whether to export client access log if MX with server
// fails and there was an error in request/connection.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to document what this means in the face of disable_client_access_logging.

It naively seems like this is ignored if disable_client_access_logging is true. And if disable_client_access_logging is false, this seems like it then would filter out error logs if false.

But I don't believe that is actually how it works.

Can we clarify?

google.protobuf.BoolValue enable_log_compression = 9;

// Optional. Controls whether to export client access log.
bool disable_client_access_logging = 10;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about this more, I feel like maybe we should move to a Logging enum (partially because I don't like disable booleans, and partially because we now have multiple options).

Maybe something like:

enum AccessLogging {
  NONE = 0;
  SERVER_ONLY = 1;
  SERVER_AND_CLIENT_ERRORS = 2;
  SERVER_AND_CLIENT = 3;
  CLIENT_ERRORS_ONLY = 4;
  CLIENT_ONLY = 5;
}

AccessLogging access_logging = 10;

Thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea as it saves from problems of multiple combinations of booleans... we would just have to match with configuring it in istio via values properly..

how about this?
`
enum AccessLogging {
NONE = 0;
FULL_SERVER_ONLY = 1;
FULL_SERVER_AND_CLIENT_ERRORS_ONLY = 2;
FULL_SERVER_AND_FULL_CLIENT = 3;
CLIENT_ERRORS_ONLY = 4;
FULL_CLIENT_ONLY = 5;
}

AccessLogging access_logging = 10;
`

just to clarify enum option 2 above?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FULL seems to be confusing to me. It makes sense to me that SERVER_ONLY means log all request.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PTAL now..

@gargnupur
Copy link
Contributor Author

@bianpengyuan, @douglas-reid , @kyessenov, @mandarjog : This is ready for review..

enum AccessLogging {
// No Logs.
None = 0;
// All logs only including both success and error logs.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// All logs only including both success and error logs.
// All logs including both success and error logs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed


inline bool StackdriverRootContext::enableServerAccessLog() {
return !config_.disable_server_access_logging() && !isOutbound();
return (!config_.disable_server_access_logging() ||
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does logger need to distinguish enablement of server or client log? Since we will configure inbound and outbound filter differently, so enablement in logger should be direction agnostic?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The direction will matter if suppose we want to enable only server logs.. but stackdriver filter will be added in both directions for metrics...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But we are going to have different configurations for inbound and outbound filter, right? Each logger/root context should be either inbound or outbound, so in terms of enablement, I think it does not need to know about directions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed.. that's why I didn't have it in my initial pr... currently it's needed for errors only case as we are enabling it for client only.. I will refactor this part once we make a decision on configs...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed...

FULL = 1;
// All error logs when no metadata is available of the peer. This is
// currently only available for outbound/client side logs.
ERR_ONLY_ON_NO_MX = 2;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is an unnecessarily subtle condition, which relates too much to the implementation details. IMO we should offload log decision to the sample filter. Ideally log sample filter could support CEL and flexibly decide whether to log based on presence of any header including mx ones. Considering log sample filter does not support CEL for now, I think it is fine to log every errors, and we can always further optimize it by adding CEL support in sample filter in the future.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If offloading to log sample filter is the agreed approach, do we need to make this an enum? Shall we keep it as boolean (or a BoolValue if disable field name is undesired) and rename it to something without server in it, like enable_access_log?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Offloading to sample filter would involve change in sampling filter and this option was added to fasttrack client side logs.. so that we don't enable client side and server logs by default causing increase in logs cost.

I totally agree with offloading the decision to log to sampling filter rather than stackdriver filter. In case of http, we can do on presence of header and for tcp on presence of filter state..

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Judging whether to log based on presence of mx data seems to be the ultimate way of reducing log volume and not sure if that is really necessary for initial release. I think only logging errors (which log sample filter already supports) already seems to be a good budget choice. IMHO we should start with this simple option and avoid complexity. @mandarjog @kyessenov @douglas-reid wdyt?

Copy link
Contributor

@douglas-reid douglas-reid Aug 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so: { NONE, ALL, ERR_ONLY } ? Seems fine to me...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated based on offline conversation. Changed to Err_Only to log on errors.

log_entries_request->mutable_resource()->CopyFrom(monitored_resource);
}

void Logger::initializeInboundLogEntryRequest(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is basically the same as the following one and has quite some logic in it. Shall we combine them and parameterize it by direction?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done..

Comment on lines +95 to +105
void fillDestinationLabels(
const ::Wasm::Common::FlatNode& node_info,
google::protobuf::Map<std::string, std::string>* label_map);

// Helper methods to fill source Labels.
void fillSourceLabels(
const ::Wasm::Common::FlatNode& node_info,
google::protobuf::Map<std::string, std::string>* label_map);

// Helper method to set monitored resource.
void setMonitoredResource(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like these methods fillDestinationLabels, fillSourceLabels, setMonitoredResource, and GetLogEntryType do not need to be part of logger class? Should we move it out to an anonymous namespace in logger.cc?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You know started with anonymous but somehow moved to logger.. thanks, changed back to anonymous!
GetLogEntryType can't be moved as it's accessing private member of Logger class

@gargnupur
Copy link
Contributor Author

/test test_proxy

// Types of Access logs to export.
enum AccessLogging {
// No Logs.
None = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: caps ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks..Fixed

FULL = 1;
// All error logs when no metadata is available of the peer. This is
// currently only available for outbound/client side logs.
ERR_ONLY_ON_NO_MX = 2;
Copy link
Contributor

@douglas-reid douglas-reid Aug 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so: { NONE, ALL, ERR_ONLY } ? Seems fine to me...

Copy link
Contributor

@douglas-reid douglas-reid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My concerns have been addressed. Removing my block.

FULL = 1;
// All error logs. This is
// currently only available for outbound/client side logs.
ERR_ONLY = 2;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

super nit: slight preference for fully spelling this out ERRORS_ONLY.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure..fixed!

// All logs including both success and error logs.
FULL = 1;
// All error logs. This is
// currently only available for outbound/client side logs.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can we add the detail from the discussion around response flags and codes here too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@istio-testing istio-testing added needs-rebase Indicates a PR needs to be rebased before being merged and removed needs-rebase Indicates a PR needs to be rebased before being merged labels Aug 11, 2020
Fix fmt

Fix fmt

Fix test

Added config options and test for the same

Fixed after rebase

Fixed config and added another test case

Run fmt

Change from ERR_ONLY to ERR_ONLY_ON_NO_MX

Fixed based on feedback

Updated config

Updated config

Fixed based on feedback
Copy link
Contributor

@bianpengyuan bianpengyuan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, only some minor comment

// either by a timer or by request size limit. Returns false if there is no
// log entry to be exported.
bool flush();
void flush(Logger::LogEntryType log_entry_type);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this method return boolean to follow the existing one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no status to return there...it's more of a helper method..

log_entries_request->set_log_name("projects/" + project_id_ + "/logs/" +
log_name);

std::string resource_type = outbound ? Common::kPodMonitoredResource
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: const std::string&

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cant change to const ref as we assigning resource_type based on cluster name key below...

log_entries_request_map_[log_entry_type]->size = 0;
auto log_entries_request =
log_entries_request_map_[log_entry_type]->request.get();
std::string log_name = outbound ? kClientAccessLogName : kServerAccessLogName;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: const std::string&

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Fixed!

!config_.disable_http_size_metrics());
if (enableServerAccessLog() && shouldLogThisRequest(request_info)) {
if ((enableAllAccessLog() ||
(enableClientAccessLogOnError() &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still don't think we need to distinguish client or not. This can just be enableAccessLogOnError. Direction is already distinguished at filter configuration.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are saying that we enable access log on error on client side only... thus the distinction...
I think the code will work for server side too if we remove the distinction.. would be a good add to server side logging... @mandarjog, @douglas-reid wdyt?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think “logOnError” will suffice.
We will not set it in server side logs config for now, but will work just as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

Copy link
Contributor

@bianpengyuan bianpengyuan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks!

@istio-testing istio-testing merged commit 3b0c7f2 into istio:master Aug 11, 2020
@gargnupur gargnupur deleted the nup_add_client_log branch August 12, 2020 22:04
@gargnupur gargnupur added the cherrypick/release-1.7 Set this label on a PR to auto-merge it to the release-1.7 branch label Aug 27, 2020
@istio-testing
Copy link
Collaborator

In response to a cherrypick label: #2955 failed to apply on top of branch "release-1.7":

Applying: Enable Client Side Access Logs for SD
Using index info to reconstruct a base tree...
M	extensions/common/context.h
M	extensions/stackdriver/stackdriver.h
M	test/envoye2e/stackdriver_plugin/stackdriver_test.go
Falling back to patching base and 3-way merge...
Removing testdata/stackdriver/server_tcp_access_log_entry_on_no_mx.yaml.tmpl
Auto-merging test/envoye2e/stackdriver_plugin/stackdriver_test.go
CONFLICT (content): Merge conflict in test/envoye2e/stackdriver_plugin/stackdriver_test.go
Auto-merging extensions/stackdriver/stackdriver.h
Auto-merging extensions/common/context.h
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 Enable Client Side Access Logs for SD
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

@istio-testing
Copy link
Collaborator

In response to a cherrypick label: new issue could not be created for failed cherrypick: status code 410 not one of [201], body: {"message":"Issues are disabled for this repo","documentation_url":"https://docs.github.com/v3/issues/"}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cherrypick/release-1.7 Set this label on a PR to auto-merge it to the release-1.7 branch cla: yes Set by the Google CLA bot to indicate the author of a PR has signed the Google CLA. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants